Combinatorial issues in text-to-speech synthesis

Author

  • Jan P. H. van Santen
Abstract

Enhanced storage capacities and new learning algorithms have increased the role of text and speech training databases in the construction of text-to-speech (TTS) systems. It has become apparent, however, that learning algorithms with strong generalization capabilities, that is, the ability to generalize from cases seen in the training database to new cases encountered during TTS operation, are not always available. This makes it important to measure and understand the degree to which a given training database covers the input domain of a text-to-speech system (usually, the entire language). The goal of this paper is to investigate the feasibility of coverage in several domains of interest for TTS. It is shown that, as a result of the combinatorics of language, coverage is typically quite disappointing. This puts a premium on the generalization capability of learning algorithms.
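
As a rough illustration of what measuring coverage can involve, the sketch below counts what fraction of the units in a test text also occur in a training text, both by distinct unit (type) and by occurrence (token). The choice of character trigrams as units, the function names `units` and `coverage`, and the toy strings are assumptions made for illustration only; they are not taken from the paper.

```python
from collections import Counter


def units(text, n=3):
    """Return the character n-grams of text (a hypothetical choice of unit)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def coverage(train_text, test_text, n=3):
    """Return (type_coverage, token_coverage) of the test units by the training units."""
    seen = set(units(train_text, n))
    test_counts = Counter(units(test_text, n))
    if not test_counts:
        return 1.0, 1.0  # nothing to cover
    covered_types = sum(1 for u in test_counts if u in seen)
    covered_tokens = sum(c for u, c in test_counts.items() if u in seen)
    return (covered_types / len(test_counts),
            covered_tokens / sum(test_counts.values()))


# Toy example: half of the distinct test trigrams were never seen in training.
type_cov, token_cov = coverage("abcabc", "abcabcxyz")
print(type_cov, token_cov)
```

Type coverage treats every distinct unit equally, while token coverage weights units by how often they occur, so the two figures can differ considerably on the same data.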

Similar resources

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...

Prosodic modelling in text-to-speech synthesis

This paper discusses three broad obstacles that must be overcome to improve prosodic quality in text-to-speech systems. First, direct and indirect limits set by the signal processing (“synthesis”) components. Second, combinatorial and statistical constraints inherent in generalizing from training corpora to unrestricted domains, and that require the integration of content-specific knowledge and...

Prosodic Modeling in Text-to-Speech Synthesis

This paper discusses three broad obstacles that must be overcome to improve prosodic quality in text-to-speech systems. First, direct and indirect limits set by the signal processing (“synthesis”) components. Second, combinatorial and statistical constraints inherent in generalizing from training corpora to unrestricted domains, and that require the integration of content-specific knowledge and ...

Stages and method of preparing syllable and diphone speech databases for a Persian text-to-speech system

Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and describes how they were developed, their specifications, and their advantages relative to each other. ...

L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors

This study aimed first to categorize L2 learners by their learning style preferences and second to investigate whether these preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and the text-related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...

Publication date: 1997